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Abstract 

This two-part paper discusses robustification methodologies for linear-iterative distributed algorithms for 
consensus and coordination problems in multicomponent systems, in which unreliable communication links may 
drop packets. We consider a setup where communication links between components can be asymmetric (i.e., 
component j might be able to send information to component j, but not necessarily vice-versa), so that the 
information exchange between components in the system is in general described by a directed graph that is 
assumed to be strongly connected. In the absence of communication link failures, each component i maintains 
two auxiliary variables and updates each of their values to be a linear combination of their corresponding 
previous values and the corresponding previous values of neighboring components (i.e., components that send 
information to node i). By appropriately initializing these two (decoupled) iterations, the system components 
can asymptotically calculate variables of interest in a distributed fashion; in particular, the average of the initial 
conditions can be calculated as a function that involves the ratio of these two auxiliary variables. The focus 
of this paper to robustify this double-iteration algorithm against communication link failures. We achieve this 
by modifying the double-iteration algorithm (by introducing some additional auxiliary variables) and prove that 
the modified double-iteration converges almost surely to average consensus. In the first part of the paper, we 
study the first and second moments of the two iterations, and use them to establish convergence, and illustrate 
the performance of the algorithm with several numerical examples. In the second part, in order to establish the 
convergence of the algorithm, we use coefficients of ergodicity commonly used in analyzing inhomogeneous 
Markov chains. 
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I. Introduction 

The design of protocols and algorithms for distributed computation and control/decision tasks has 
attracted significant attention by the computer science, communication, and control communities (e.g., 
ID, [HI, [|3l, flU, jSl, O, and references therein). For example, given i) a collection of robots 
moving in the plane, ii) a collection of sensors in a sensor network, or iii) a collection of distributed 
energy resources in an electrical grid, the components may be interested in, respectively, i) agreeing 
on a common direction to follow (this common direction could be provided by a leader robot), ii) 
measurement averaging (with each sensor providing a local measurement of a global quantity), or iii) 
collectively providing a predetermined total amount of active power subject to the constraints of each 
distributed resource. In the control literature, the first and second problems are respectively known as 
consensus and average consensus (see, e.g., JH), whereas the third problem can be considered as a 
distributed resource coordination problem [HI, flU. 

In this two-part paper, we consider multicomponent systems in which each component can exchange 
information with other components in its neighborhood in order to compute, in a distributed fashion, 
some quantity of interest. In our setup, communication links between components (nodes) can be 
asymmetric (i.e., component j might be able to send information to component i, but not necessarily 
vice-versa), a situation that arises in a wireless setting if the transmission power available to different 
nodes are also different. In this setting, the information exchange between components in the system 
can be described by a directed graph which is assumed to be strongly connected. Through an iterative 
process, nodes in the network are required to compute (using only information made available by their 
neighbors) the quantity of interest. In particular, we study linear-iterative algorithms in which each node 
j maintains a value (or a set of values) that is updated to be a weighted linear combination of node j's 
own previous value and the previous values of its neighboring nodes (i.e., nodes that transmit information 
to node j). The main focus of the paper is to develop strategies to robustify the linear-iterative algorithms 
described above against communication links that may drop packets. 

In the context of consensus and average-consensus problems, an extensive literature in the control 
community focuses on the linear-iterative algorithms described above (e.g., [[7]|, |[TOl . [[2||, ([TTI . [[T2|| . 
|[T3l . |[T4l and references therein). These works have revealed that if the network topology satisfies 
certain conditions, the weights for the linear iteration can be chosen so that all the nodes asymptotically 
converge to the same value (even if the network connections are time- varying). Additionally, if the 
interconnection topology is invariant and bidirectional (i.e., if node j can send information to node i, 
then node i can send information to node j), simple techniques can be used to choose the weights of the 
linear iteration so as to ensure that, after running the linear iteration, the nodes will asymptotically reach 
consensus to the average of their initial values ifTTIl . lfT2[| . Other works have looked at the consensus 
and average-consensus problems when the interconnection topology is described by a directed graph. 
In particular, the authors of [[TSl focus on continuous-time linear iterations and state necessary and 
sufficient conditions for a network of integrators to asymptotically reach agreement to a common value 
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(but not necessarily the average of their initial values). Similarly, the authors of lfT2l consider discrete- 
time iterations, and provide necessary and sufficient conditions on the weights that allow the nodes to 
asymptotically reach consensus to the average of their initial values. Additionally, the work in [[T6ll . 
IfTTll discusses how average-consensus can be reached asymptotically with linear-iterative algorithms in 
which the nodes use fixed weights in their linear updates and also develops linear-iterative algorithms 
where the nodes adapt their weights in a distributed fashion so that asymptotically average-consensus 
is reached. In the context of resource coordination, there is some recent work ifTSl . |[8l, flU that also 
focuses on linear-iterative algorithms, similar to those used to address consensus and average-consensus 
problems. Other recent work has addressed the related problem of achieving consensus and average- 
consensus in a multicomponent system where some nodes can exhibit malicious behavior lfT9ll . [|20ll . 
These works assume fault-free communication links, but are related to what we do in this paper in the 
sense that they can be used to handle unreliable nodes (as opposed to links). 

In our development, we adopt a very general model for the communication modality between nodes, 
which allows asymmetric information structures, in the sense that if node i can transmit information to 
another node j, it is not necessarily true that node j can transmit information to node i. We only require 
that each node, apart from seeing incoming transmissions sent to it by neighboring nodes, knows the 
number of nodes that it can transmit information to, which in graph-theoretic terms is referred to as the 
out-degree of that node. In fact, in the proposed algorithm, each node will broadcast the same quantity to 
all receiving nodes, which simplifies the communication scheme between sending and receiving nodes 
(as it is not necessary for each sending node to separately communicate with each receiving node). 

When the communication network is perfectly reliable (no packet drops), the collective dynamics of 
the linear iterations can be described by a discrete-time transition system with no inputs in which the 
transition matrix is column stochastic and primitive. Then, each node will run two identical copies of 
the linear iteration each of which, however is initialized differently depending on the problem to be 
solved.In this paper we mostly focus on the average consensus problem. Under proper initialization, 
it can be shown that each node will asymptotically calculate the desired value as a function of the 
outcomes of the two iterations. The details of these double-iteration approach are provided in [fT6ll . 
IfTTll for the average consensus case and in |[8l, [|9l| for the resource coordination problem. For the 
average-consensus problem, the double-iteration algorithm is a particular case of the algorithm in ||2TI 
(which is a generalization of the algorithm proposed in 1221), where the matrices describing each linear 
iteration are allowed to vary as time evolves, whereas in our setup (for the ideal case when there are 
no communication link failures) the transition matrix is fixed over time. 

The focus of this paper is to robustify the double-iteration algorithm (informally described above and 
formally described in Section HH) so that it can tolerate failures in communication links and converge 
to the average value. Our communication link reliability model assumes that at each time step, a 
communication link is unavailable with some probability. In other words, a packet containing information 
from node i to node j is dropped with some probability. Next we informally describe our robustification 
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approach. Consider two nodes i and j, and assume that j receives information from node i but not 
necessarily vice-versa. Let us refer to j as the receiving node (or receiver) and i as the sending node 
(or sender). An important requirement is for the graph describing the communication network to be 
strongly connected, which implies every node must be able to act both as a sender and as a receiver. 
Then, for each of the two iterations each node performs, node i (the sender) will keep track of the 
following quantities of interest: i) its own internal state (as captured by the state variables maintained in 
the double iteration scheme of [l2T]|. [[TSl : ii) the total mass broadcasted so far (to be described in detail 
soon); and iii) the total received mass from each node / that sends information to node i. Similarly, 
for both iterations, each node j (the receiver) updates the value of its internal state to be a linear 
combination of its own previous internal state value (weighted by the inverse of the number of nodes 
that have j as a neighbor) and the difference between the two most recently received mass values from 
each of its neighbors (also weighted by the inverse of the number of nodes that have j as a neighbor). 
At time instant k, the total broadcasted mass by node j is the sum up to (and including) time step k of 
the sequence of values of node j's internal value, weighted by the inverse of the number of nodes that 
receive values from node j). Additionally, node j updates the value of the received mass from each node 
I that sends information to node j as follows: the received mass from node / is the total broadcasted 
mass sent by node / up to time k if the communication link from node / to node j is available at time 
step k; otherwise, the received mass remains the same as the most recently received mass from node 
/. An implicit assumption here is that messages broadcasted by node / are tagged with the sender's 
identity so that the receiving node j can determine where different packages have originated from. 

Recent work that has addressed the consensus and average-consensus problems in the presence of 
unreliable communication links [|23l . [|24ll . Il25l has employed a communication link availability model 
similar to ours. The work in ll23l assumes that the graph describing the communication network is 
undirected and that when a communication link fails it affects communication in both directions. 
Additionally, nodes have some mechanism to detect link unavailability and compensate for it by rescaling 
their other weights (so that the resulting transition matrix remains column stochastic). Following this 
strategy, the authors show asymptotic convergence to the average of initial conditions and also calculate 
the rate at which the variance of the total deviation from the average converges to zero. The work in 
ll24l does not require the graph describing the communication network to be undirected and proposes 
two compensation methods to account for communication link failures. In the first method, the so-called 
biased compensation method, the receiving node compensates for the unavailability of an incoming link 
by adding the weight associated to the unavailable link to its own weight (so that the resulting matrix 
remains row stochastic). In the second method, called the balanced compensation method, the receiving 
node compensates for link unavailability by rescaling all the incoming link weights so that the resulting 
matrix remains row stochastic. The key in both methods is the fact that at each time step, the resulting 
weight matrix is row stochastic; the authors show that the nodes converge almost surely to the same 
value, but this value is not necessarily the average of the initial conditions. The work in ll25l . which does 
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not require the communication graph to be undirected, proposes a correction strategy that corrects the 
errors in the quantity (state) iteratively calculated by each node, so that the nodes obtain the average of 
their initial values. This correction strategy is based on each node maintaining some auxiliary variable 
that accounts for the amount by which node i changes its state due to the updates from its neighbors, 
i.e., the nodes that can send information to node i. For their strategy to work and ensure that the nodes 
converge almost surely to the average consensus, the authors rely on the nodes sending acknowledgment 
messages and retransmitting information an appropriate number of times. 

In II2TII . the authors proposed a gossip-based algorithm for average-consensus ver a directed graph 
where the transition matrices describing the nodes' collective dynamics change at every iteration step 
(depending on which node awakes). This scheme requires the node that is awake to perform an 
internal state update and send its internal state (weighted by the corresponding out-going link weight) 
to its neighbors. This approach results in generates a sequence of column stochastic matrices (not 
necessarily primitive) with the property that all the diagonal entries remain positive. The authors 
prove that by running two such iterations in parallel, one of them initialized with the values on 
which the average operation is to be performed and the other with the all-ones vector, each node 
will asymptotically achieve average consensus by taking the ratio of the two values in maintains. A 
key premise in their proof is that column stochasticity of the transition matrix is maintained over time, 
which requires sending nodes to know the number of nodes that are listening. This suggests that i) either 
the communication links are perfectly reliable, or ii) there is some acknowledgment and retransmission 
mechanism that ensures messages are delivered to the listening nodes at every round of information 
exchange. In this paper, we remove such assumptions and robustify the double-iteration algorithm against 
unreliable communication links using a pure broadcast-message model without any requirement for an 
acknowledgment/retransmission mechanism. Thus, despite the reliance of our algorithm on the ratio of 
two linear iterations, it is different both in the communication model we assume — a broadcast model 
in our case — and also in the nature of the protocol itself — our focus is on ensuring convergence in the 
presence of communication link failures. 

An additional assumption made in [[2T]| is that the diagonal entries of the transition matrix (at every 
step) remain positive. In our model, we originally consider that nodes do not drop self-packets. However, 
to ease the analysis, we remove this assumption and consider the case where self -packet drops are also 
allowed at every time step, which i) allows us to handle intermittent faults in the node processing 
device, and ii) removes the assumption that all diagonal entries must be positive at every step. Finally, 
the analysis machinery in [|2T1 is quite different from the one used in this paper. We employ moment 
analysis of the two iterations to establish that they are linearly related as the number of steps goes to 
infinity, while [[2T| relies on establishing weak ergodicity of the product of the transition matrices as 
the number of steps goes to infinity. Finally, as it will be shown in the second part of this paper, our 
algorithm can be re-casted into a similar framework as the one in ETl by augmenting the dimension 
of the vector describing the collective dynamics to account for the packets that get dropped once there 
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is a communication failure. Note, However, that the resulting matrices will be column stochastic but 
will not necessarily have strictly positive entries on their diagonals. In the second part of the paper, we 
provide an analysis framework to establish the convergence of our algorithm and generalizes the ideas 
in ETI to the case when the matrices describing the system collective dynamics do not have strictly 
positive diagonals. In this regard, we will show that even in the case where self -packet drops are not 
allowed, the resulting transition matrices might still have zero diagonal entries. 

The remainder of this paper is organized as follows. Section HI] provides background on graph theory, 
introduces the communication model, and briefly describes the non-robust version of the double-iteration 
algorithm we use in this work. Sectionllllldescribes the proposed strategy to robustify the double-iteration 
algorithm against communication link failures and illustrates the use/performance of the algorithms via 
several examples. The convergence analysis of the robustified double-iteration algorithm is provided in 
Section |Vl Concluding remarks are presented in Section |VIl 



This section provides background of graph-theoretic notions that are used to describe the communica- 
tion network and the distributed consensus/coordination setup, introduces the basic communication link 
availability model, and reviews a previously proposed two-iteration algorithm that can be used to solve 
the class of problems addressed in this paper when the communication network is perfectly reliable. 

A. Network Communication Model 

Let discrete time instants be indexed k = 0, 1, ... ; then, the information exchange between nodes 
(components) at each time instant k can be described by a directed graph Q[k] = {V,S[k]}, where 
V = {1, 2, . . . , n} is the vertex set (each vertex — or node — corresponds to a system component), and 
£[k] C V X V is the set of edges, where (j, i) G £[k] if node j can receive information from node i at 
instant k. It is assumed that £[k] C VA; > 0, where £ is the set of edges that describe all possibly 
available communication links between nodes; furthermore, the graph (V, £) is assumed to be strongly 
connected. All nodes that can possibly transmit information to node j are called its in-neighbors, and 
are represented by the set J\f^ = {i E V : (j, i) E £}. Note that there are self-loops for all nodes in Q 
(i.e., (j, j) E £ for all j E V). The number of neighbors of j (including itself) is called the in-degree 
of j and is denoted by Vj = | A/^^ | . The nodes that have j as neighbor (including itself) are called its 
out-neighbors and are denoted by J\f^ = {I eV : E £}; the out-degree of node j is Vj^ = |A/^^|. 

The existence of a communication link from node i to node j can be described in probabilistic terms 
as follows. At instant k, let Xji[k], E V be an indicator variable that takes value 1 with probability 
q and takes value zero with probability I — q, i.e.. 



II. Preliminaries 




(1) 
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Then, for all A; > 0, the existence of a communication link between nodes i and j can be described be 
another indicator variable iji[k] defined as 

It follows that £[k] contains the elements of £ for which iji[k] = Xji[k] = 1. 

B. Double-Iteration Algorithm Formulation in Perfectly Reliable Communication Networks 

When the communication network of a multi-component system is perfectly reliable, i.e., Pr{£jj[/i;] = 
1} = 1, V(j, i) e £, \/k > 0, it was shown in [[T71 , [[91 that the components of the multi-component 
system can asymptotically solve average consensus and resource coordination problems in a distributed 
fashion by running two separate appropriately initialized linear iterations of the form 

zAk + i]= Yl ^^4^], (4) 

where Vj' (T^t) is the out-degree of node j (i). A requirement in all cases is that the underlying 
communication graph {Q, 8) is strongly connected. 

1) Average Consensus Problem: In this problem, the nodes aim to obtain the average of the values 
Vj, j = 1, ... ,72, they each posses. In [[TTl . it was shown that if the initial conditions in ^ (referred 
to as iteration 1) are set to yj[0] = Vj, and the initial conditions in dH) (referred to as iteration 2) are 
set to Zj[0] = 1, then the nodes can asymptotically calculate v := Y^^=i'^j/^ 

V = lim y-M, (5) 

fc^-oo Zj[K\ 

by running the two iterations in ([3]) and @). 

2) Resource Coordination Problem: In this problem, each node j can contribute a certain amount 
TTj > of a given resource, which is upper and lower bounded by known capacity limits -jij^"-^ and 
^min respectively. The challenge is to coordinate the components so that they collectively provide a 
pre-determined total amount pd = Yl]j=i of the resourcqll as specified by an external "leader." In [0, 
it was shown that i) if the initial conditions in ^ are set to yj[0] = pd/m — vr™" if j is an out-neighbor 
of the leader (where m > 1 is the number of nodes contacted initially by the external leader) and 
?/j[0] = — TT™™ otherwise, and ii) if the initial conditions in dU are set to ^^[0] = tt™"^ — tt™*", then the 

'in the development in 1181 . (9j, it is assumed that X]"=i t^^^^ ^ Pd ^ X]"=i ^7""^' ''^'^ ^ restrictive assumption because in the 

proposed algorithms, all nodes will be able to know if pd < X]j=i i"™'" or if pd > X]j=i ^7*"^ (which means that no feasible solution 
exists). 
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nodes can asymptotically calculate their own resource contribution nj as 



71 



lim ( 



A;— >oo 



which satisfies 



Zj[k] 



'max 



^mm 



)) 



„mm 



En 
1=1 



mm 
I' I 



'max „mm\ 



(6) 



„mm ^ _. ^ „m.ax \j ■ 



(7) 



In this paper, we start with a double iteration of the form in ©-dH) that is used for either aver- 
age consensus or coordination, and develop systematic methodologies to handle packet drops in the 
communication links. 



III. ROBUSTIFICATION OF DOUBLE-ITERATION ALGORITHM 

In this section, the algorithm described in Section III-BI is modified so as to make it robust against 
communication link failures. As in ©-(SI), each node will run two iterations (which we refer to as 
iterations 1 and 2) to calculate quantities of interest and eventually solve the average consensus or 
coordination problems. 

Consider the setup described in the previous section: we are given a (possible directed) strongly 
connected graph {Q, £) representing a multicomponent system and its communication links between its 
components. For the sake of generality, let us refer to j as the receiving node (or receiver) and i as 
the sending node (or sender). For each of the two iterations, node i (the sender) will calculate several 
quantities of interest, which we refer to as: i) internal state; ii) total broadcasted mass; and iii) total 
received mass from each in-neighbor / of node i, i.e., for each node / G J\f^. For both iterations 1 and 2, 
each node j updates the value of its internal state to be a linear combination of its own previous internal 
state value (weighted by the inverse of the number of nodes that have j as a neighbor, i.e., l/V^) and 
the sum (over all its in-neighbors) of the difference between the two most recently received mass values. 
At instant time k, the total broadcasted mass is the sum up to (and including) step k of the weighted 
value of node j's internal state (used to update the internal state of node j). Additionally, node j (the 
receiver) updates the value of the received mass from node / to be either the total broadcasted mass sent 
by node i if the communication link from i to j is available at instant k, or the most recently received 
mass value from node i, otherwise. An implicit assumption here is that messages broadcasted by node 
i are tagged with the sender's identity so that the receiving node j can determine where messages 
originated from. 

For iteration 1, let yj[k] denote node j's internal state at time instant k, iJ.ij[k] denote the mass 
broadcasted from node j to each of its out-neighbors / (this is a single value ad it is the quantity is 
the same for each out-neighbor / of node j, i.e., for each / G J^f), and i^ji[k] denote the total mass 
received at node j from node i G A/^^. Similarly, let Zj[k] denote node j's internal state takes at time 
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instant k, o-ij[k] denote node j's broadcasted mass for each out-neighbor /, / G J\f^, and rji[k] denote 
the total mass received from node i E . Then, the progress of iteration 1 is described by 

yj[k + 1] = ^y,[k] + {u,,[k] - u,,[k - 1]), A; > 0, 



J j=0 1 



where 



Vji[k] 



/i,,[/c], if{],i)E£[kl k>0, 
v,Ak-ll ii{j,i)i£[kl k>0. 



Recall that Vj^ (T^t) the number of nodes that node j (i) can transmit information to. Similarly, the 
progress of iteration 2 is described by 

1 1 

aij [k] = ai, [k-l] + --Z, [k] = ^ — [z] , k> 0, (9) 



where 



Tji[k] 



J i=0 3 



aji[kl ii{j,i)e£[kl k>0, 
rj,[k-l], ifij,t)^£[k], k>0. 



As mentioned earlier, for solving the average consensus problem, the initial conditions in ^ are 
set to yj[0] = Vj, whereas the initial conditions in ^ are set to Zj[0] = 1. Similarly, for solving the 
resource coordination problem, the initial conditions in ^ are set to yj[0] = p^/m — tt™" if j is initially 
contacted by the leader and yj[0] = — vr™*" otherwise, whereas the initial conditions in ^ are set to 
Zj[0] = vr™"^ — vr™" > 0. In both the average consensus and coordination problems, /ijj— 1] = and 
1] = for all (j, i) G £, and crjj[— 1] = and t^J— 1] = for all (j, i) G £. 

Main Result: We shall argue that with the proposed robustification strategy, despite the presence of 
unreliable communication links (at each time step, each link (j, i) G £, fails independently from other 
links and independently between time steps, with some probability 1 — qji), nodes can asymptotically 
estimate the exact solution v to the average consensus by calculating, whenever Zj[k] > the ratio 
yj[k]/zj[k], i.e., 

k^oo Zj [k\ 
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whenever Zj [fc > 0] . Similarly, exact solution to the resource coordination problem can be obtained as 

whenever Zj[k] > 0. In both cases, we run the iterations in ([8]) and Q and using the corresponding initial 
conditions as outlined above. In particular, we will show that, for every j, Zj[k] — ^""^ W ~^ 
as A; — oo almost surely. Additionally, we will show that Zj [k] > occurs infinitely often. 

A. Examples 

We now illustrate how the proposed algorithm works for the case of average consensus in the 
presence of packet-dropping communication links. We start with the rather small network shown in 
Fig. [T] and assume that the packets on each link (including the self-links which are not drawn in the 
figur^ can be dropped with probability 1 — g, independently between different links and independently 
between different iterations. We also assume that the initial values of the five nodes are given by 
V = [—4, 5, 6, —3, 1]^, with their average equal to 1. Thus, in the iterations ^ and ^ 

y[0] = [-4, 5, 6, -3, 1]^, and ^[0] = [1, 1, 1, 1, 1]"^ , 

with = Vji[-1] = o-jil-l] = Tji[-1] = for all G £. 

We run the iterations in ^ and ^ and plot the ratio ^4n a function of the iteration step k for each 
node j = 1, 2, 3, 4, 5). Figure [2] shows the typical behavior that we observe for q = 0.99 (i.e., for a 
probability of a packet drop equal to 0.01). As can be seen in the figure, the ratio at each node quickly 
converges to the correct average, though the individual values for yj[k] and Zj[k] do not converge. 

In Figs. [3] and |4] we show typical behaviors of the same multicomponent system for q = 0.5 and 
q = 0.1. The behavior remains similar to the one observed before: even though yj[k] and Zj[k] do not 

^We make this assumption later in the paper for the purposes of simplifying notation. 




Fig. 1. Small directed graph used for illustration of the ratio algorithm for obtaining average consensus in the presence of packet-dropping 
communication links. 
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V al Each Node vs lie rations 



Each Node vs llerations 





Ralio al Each Node vs lleralions 




(a) 



(b) 



(c) 



Fig. 2. Evolution of tlie values of yj[k] (left), zj[k] (middle) and -^-ny (right) for q = 0.99 



[k] 



y at Each Node vs Iterations 



z at Each Node vs Iterations 



Ratio at Each Node vs Iterations 






(a) 



(b) 



(c) 



Fig. 3. Evolution of the values of yj[k] (left), Zj[k] (middle), and ^ (right) for q = 0.5, j = 1, 2, 3, 4, 5 



converge (in fact, they seem to behave more radically with decreasing q), the ratio ^44 does converge 
to the average of the initial values. Note that the plot of the ratio in Fig. |4] is quite different than the 
rest: in this case, q is small enough so that Zj[k] (and simultaneously yj[k]) can take the value zero 
(e.g., when all packets destined to node j are dropped at iteration k); thus, the ratio in such cases is 
not defined and is not plotted, resulting in a discontinuous set of points in the plot. Nevertheless, we 
can see that when packets are received (which happens frequently enough for each node), the ratio has 
the correct value. This is a point addressed later in the paper. 

An example of what happens in larger graphs is shown in Fig. \5\ Here we consider a graph with 50 
nodes, randomly generated by choosing a directed edge from node i to node j, 1 < i, j < 50, i ^ j, 
independently with probability 1/2, and ensuring that the resulting graph is strongly connected. As can 
be seen the behavior remains similar to what we observed for the smaller graph: the ratio |^ converges 
quickly to the average even though the individual yj[k] and Zj[k] do not converge. For this particular 
plot, we used q = 0.1, which also justifies the fluctuation in the values of yj[k] and Zj[k]. 
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V ai Each Node vs lie rations 



Each Node vs Iterations 




(a) 




Ratio at Each Node vs Iterations 



(b) 



Fig. 4. Evolution of the values of yj[k] (left), zj[k] (middle), and (right) for g = 0.1, j = 1, 2, 3, 4, 5. 
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(c) 



y at Each Node vs Iterations 



z at Each Node vs Iterations 



Ratio at Each Node vs Iterations 




(a) 




(b) 




Number of Iteralioi 



(c) 



Fig. 5. Evolution of the values of yj[k] (left), Zj[k] (middle), and ^^j^ (right) for g = 0.1 for a 50-component system. 



IV. First and Second Moment Analysis 

In this section, we obtain recurrence relations that describe the first and second moment of the 
iterations after ([8]) and this analysis is used in Section |V] to establish the claims in (flOl) and (fTTI) . In 
order to ease the moment calculations, the expressions in ([S])-® will be rewritten more compactly in 
vector form. Also, in order to facilitate notation, we will allow each node j to drop the packet carrying 
its own previous value when updating its value. This way, node j handles its own value in the same 
way as its neighbors' values and notation is simplified significantly. 



A. Vectorized Description of Double-Iteration Algorithm 

Using the definition for the indicator variable Xji[k] given in ([T]) and the resulting indicator variable 
iji[k] given in which describes the successful transmission of information from node i to node j 
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over an existing, unreliable communication link, iterations ([8]) and ^ can be rewritten as 

r ^^,^[k - 1] + ^y,[k], iflE^^;, k>o, 

l^ij[k] = < ^ ^ (12) 

[ 0, if/^Ar+, k>0, 

^ ll^i I f^ii[^]^Ak] + i'ji[k-l]{l-Xji[k]), ifieAff, k>0, 
'"^ ^ \ 0, if z^J\f-, k>0, 

n 

yj[k + l] = Y, (^i^W - ^J^y^ - 1])' ^ ^ 0' (14) 



i=l 



and 

[ 0, if/^Ar+, A;>0, 

r \k] = [ t^]^^" + - 1] ( 1 - ^J-* W ) , if ^ e ATr , A; > 0, 
^ \ 0, if i e A/'r, A; > 0, 

n 

+ 1] = 5^ - r,4fc - 1]), > 0, (17) 

i=l 

where = = = Tj,[-1] = 0, Vj,2. 

Let Ao B denote the Hadamard (entry-wise) product of a pair of matrices A and B of identical size. 
Then, for all A; > 0, iteration ([T2)) - (fT4)) can be rewritten in matrix form as 



Mfc = Mfc_i + Pdiag(2/fc), (18) 
Nk = MkoXk + iVfe.i o {U - Xfc), (19) 
= (iVfc - iVfc_i)e = [(Mfc - iV,,_i) o Xfc] e, (20) 

where P = [pji] e M"""", with = Vj G A/^+ and pji = otherwise; M_i = iV_i = 0; = 

i 

U E M"^", with [Uji] = 1, diag(i/fc) is the diagonal matrix that results by having the entries of 

yk on the main diagonal; and e = [1, 1, . . . , 1]' (note that U = ee^). Similarly, for > 0, (fTSlj-ifTTI) can 
be rewritten in matrix form as 

5fc = 5fc_i + Pdiag(zfc), (21) 
n = SkoXk + Tfc_i o {U - Xfe), (22) 
Zk+i = (Tfc - Tfc_i)e = [{Sk - Tfc_i) o Xk] e, (23) 

where S^i = T_i = 0, Zk = z[k], and diag(2;fc) is the diagonal matrix that results by having the entries 
of Zk on the main diagonal. 

By defining Ak := — Nk-i and Bk := 5*^ — Tfc-i, iteration (fT8l)-(|20l) can be rewritten more 
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compactly as 

Afc = Afc_io(f/-Xfc_i)+Pdiag(yfc), k>l, (24) 

Vk+i = (Ak o Xk)e, k>0, (25) 

and iteration dlB-dlS]) as 

Bk = o {U - Xk-i) + Pdiag(z,), k>l, (26) 

Zk+i = {Bk o Xfc)e, /c > 0, (27) 

where = Mq - A^_i = Pdiag(?/o), and Bq = So - T_i = Pdiag(zo)- 

For analysis purposes, each matrix in (|24l) -(l25l) and (I26l)-(l27l) will be rewritten in vector form by 
stacking up the corresponding columns |^ Then, (l24l)-(|25l) and (|26l)-(|T7l) can be rewritten in vector 
form as follows. Let F = [In In ■ ■ ■ In] ^ M"^"\ where /„ is the n x n identity matrix, and P = 
[EiP'^ E2P^ . . . EnP^f e M"'^", where Ei G M"''" has Ei{i, i) = 1 and all other entries equal zero. 
[The entries of EiP^ E M"^*^ {PEj = PEi) are all zero except for the z*'^ row (column) entries, which 
are those of the i*'' row (column) of matrix P^ (P).] Then, (l24l)-(|25l) can be rewritten as 

Ofc = afc„i o (u - Xk^i) + Pyk, /c > 1, (28) 
Vk+i = F{ak oxk), k>0, (29) 

2 2 2 

where G M" , G M" , and Xk^i G result from stacking the columns of matrices Ak, Xk, and 



Xk-i, respectively. Similarly, (I26l)-(I27|) can be rewritten as 

h = h-i o (n - Xk-i) + Pzk, k>l, (30) 
Zk+i = F{bk o Xk), k>0, (31) 

2 

where bk G M" results from stacking the columns of matrix Bk- 

Remark 1: It is important to note that matrices Ak and Bk, and their corresponding vectors and 
bk, have some entries that remain at zero for all A; > 0. Specifically, the (j, i) entry of matrices Ak and 
Bk (and their corresponding entries in and bk) remain zero if there is no communication link from 
node i to node j, i.e., (j, i) ^ £. The reason we keep these entries (despite the fact they are zero and do 
not play a role in the analysis) is because they facilitate matrix notation and calculations in subsequent 
developments. □ 

Since it will appear later at several points of the analysis, it is worth noting that when premultiplying 
P by F, we recover the matrix P, i.e., 

P = PP. (32) 

^If we let A = [Aij] G R"""", then a = [An, A21, . . . , A^i, A12, A22, ■ ■ ■ , A,^2, . ■ ■ , Ain, A2n, ■ ■ ■ , A^n]^ . Vectors defined by 
stacking the columns of a matrix will be denoted with the same small letter as the capital letter of the corresponding matrix. 
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B. First Moment Analysis 

In this section, we describe the first moment dynamics of (|28l)-(l3T|) via discrete-time transition systems 
with no inputs, where (as shown below) the corresponding transition matrices (which are obtained from 
P and q) are column stochastic and primitive. In both iterations, the sum of the entries of the first 
moment vectors for yk and Zk is shown to remain constant over time and be respectively equal to the 
sum of q^iyoii) and q^iZQ{i). Furthermore, both first moments E[i/fc] and ^[zk] are shown to reach 
a steady-state value as k goes to infinity. The above discussion is formalized in the following lemma. 

Lemma 1: Let a^, bk, yk, and Zk be described by the recurrence relations in (|28l) - (|29l) . and (l30l)-(l3TI) 
respectively. Let the first moments of a^, yk, bk, and Zk (i.e., E[afc], 'E[yk], E[6fc], and E[za,]) be denoted 
by Hk, yk, bk, and Ik respectively. Then the evolution of ak, Vk, bk, and Zk, V/c > 1, is governed by 



ak = [qPF + (1 - g)/„2]afc-i, (33) 

yk+i= [qP + {1 - q)In]yk, (34) 

bk= [qPF + {l-q)I,,2]bk-u (35) 

Zk+i = [qP + (1 - q)In]zk, (36) 



where is the m x m identity matrix, with Hq = Pyo, y^ = qPyo, bo = Pzq, and = qPz^. 

Proof: Since the development for obtaining and y^, is parallel to that for obtaining bk and Zk, 
our analysis focuses on the first case. For A; = in (l28l)-(|29l). by taking expectations of both sides and 
noting that packet drops at time step A; = are independent of the initial values for oq, it follows that 

ao = Pyo, (37) 
Vi = qFao. (38) 

Substituting (|37|) into (l38l) . we obtain y^ = qPFy^ = qPlj^. 

For A; > 1 in (|28])-(|29l), noting that packet drops at time step k are independent of previous packet 
drops and the initial values of a^, it follows, by taking expectations on both sides, that 

ak = ttk-i o{u- Xfc_i) + Pyk = ttk-^i o (n - Xk^i) + PVk = (1 - q)ak~i + PVk, (39) 
Vk+i = F{ak o Xk) = Fiak o Xk) = qFak. (40) 

Substituting (gO]) into ^9^, we obtain 



afc = (1 - q)ak-i + qPFak-i 
= [qPF + (1 - g)/„2]afc_i, 



(41) 
(42) 
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Similarly, substituting (1391 ) into (1401) . we have 

Vk+i = (1 - q)qFak^i + qFPy^ (43) 

= {l-q)yk + qFPyk (44) 

= [qP + {l-q)In]yk, (45) 
where /„ is the n x n identity matrix. ■ 

C. Second Moment Analysis 

In the order to calculate the second moment dynamics for (|28l) - (|3T1) . we utilize in the following 
lemma. 

Lemma 2: Let x, c and d be random vectors of dimension n. Furthermore, assume that the entries of x 
are Bernoulli i.i.d. random variables such that Pr{xj = 1} = g and Pr{a;j = 0} = 1 — g, Vi = 1, 2, ... n, 
and are independent from c and d. Then 

S ■.= E [{c ox){xo df] = E[cd^] + q{l - g) E [diag(crf^)], (46) 
T := E [(c o x) {{u ~x)odf] = g(l - q) E[cc/^] - g(l - g) E [diag(crf^)], (47) 

where diag(c(i-^) is a diagonal matrix with the same diagonal as matrix cd^ . 
Proof: The j), i 7^ j, entry of S can be obtained as follows: 

Sij = E [ciXidjXj~\ . (48) 

Since Xj and are pairwise independent, and independent from c and d, it follows that 

E [ciXidjXj] = g2 E [cid^] . (49) 

For i = j, observing that E[xiXi] = E[xi] = g, Vz = 1, . . . , n, we obtain the corresponding entry of S 
as 

S'ii = E [dXidiXi] = E [cirfjXi] = g E [ddi] . (50) 

In (|49l) . it is easy to see that E [cjrfj] is the (z, j) entry of E[c(i^]. Similarly, in (|50l ), it is easy to see 
that E [cidi\ is the (z,z) entry of E[crf^]. From these observations, the result in (l46l) follows. 
Similarly, the i 7^ j, entry of T can be obtained as follows: 

Tij = E [ciXidj{l - Xj)] . (51) 

Since Xj and (1 — Xj) are independent, it follows that 

E [axidjil - Xj)] = E [adj] E [xi{l - Xj)] = g(l - g) E [adj] . (52) 
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For i = j, and observing that E[xj(l — Xj)] = 0, Vz = 1, . . . ,n, the corresponding entry of T can 
obtained as follows; 

Tu = E [c,Xidi{l - X,)] = E [adi] E [xi{l - Xi)] = 0. (53) 

The result in (|47]) follows from ^ and (gS]). ■ 

The following lemma establishes that the evolution of E[afca^], E[6fc6^], and E[afe6^], and can be 
expressed as linear iterations with identical dynamics but different initial conditions. Similarly, the 
evolution of E[?/fc|/J], E[zfc2;J], and E[ykz'J^] can also be expressed as linear iterations with identical 
dynamics but different initial conditions. 

Lemma 3: Consider the second moments of Ofc, yk, b^, and Z)^, and let E[afca|^], E[yfei/J], E[6fe6|^], 
E[2;fc2;J], Elakbl], and E[yfc2;J]) be denoted by r^, ^k, A^, Ek, and Tfc respectively. Then, the 
evolutions of F^, ^fc, A^., E^, T^, V/c > 1, are described by the following iterations (where all / 
denote x identity matrices): 

Ffc = [qPF +{l-q)l] F^^i [qPF + {1 - q)lf + q{l - q)[I - PF]diag(Ffc„i) [/ - Ppf, (54) 

= F [q^Tk + g(l - g)diag(Ffc)] F^, (55) 

= [gPF + (1 - g)/]^fc_i [gPF + (1 - g)/]^ + g(l - q)[I - PF]diag(^,_i)[/ - PFf, (56) 

Afc+i = P [g'^fc + q{l - g)diag(^fc)] P^, (57) 

Ek = [qPF + (1 - g)/]H,„i [qPF + (1 - g)/]^ + g(l - g)[/ - PP]diag(Hfc„i)[/ - PPf, (58) 

Tfc+i = P [g^H, + g(l - g)diag(Sfc)] P^, (59) 

with initial conditions 

Fo = PyoVoP'^, (60) 

<f 1 = ViVl + g(l - q)Fdmg{Pyoy^P'')F^, (61) 

^0 = Pzoz^P'', (62) 

Ai = zizf + g(l - g)Pdiag(Pzo^(r^'^)^^, (63) 

Ho = Pyoz^P^, (64) 

Ti = mzl + g(l - g)Pdiag(P|/o^o'^P'^)i^'^. (65) 

Proof: The derivation of dSl), ([55]), dlQl), and dlB, is the same as the derivation of ^B, dSV]), (l62l) . 
and (l63l) . thus the developments in the proof will only address the former. For k = 0, it follows from 
Lemma [Hand (l28l) that ao = -Pyo- Then, 

Fo = E[aoa^] = P E[yoyo'^]P^ = PyoVoP^, (66) 



COORDINATED SCIENCES LABORATORY TECHNICAL REPORT UILU-ENG-1 1-2207 (CRHC-11-05) 



18 



and 

$1 = E[yiyJ] = E [F(ao o xo)(xo o a^fF^] = FE [(ao o xo)(xo o aof]F^. (67) 

Applying the results in Lemmas \T\ and [21 to (l67l) . it follows that 

$1 = q^FE [aoa^]F^ + q{l - q)FE [diag(aoa^)] F'^ 

= {qFPyo){qFpyof + q{l - q)FE[dmg{Pyoy^P^)]F^ 
= {qPyo){qPyof + q{l - q)Fdmg{Pyoy^ P^)F^ 

= y^ti + g(l - q)FAmg{Py^y^P'^)F^, (68) 

where we used the fact that FP = P (refer to Eq. (|32|) ) . 

For k > 1, and taking into account that y^ = F{ak-i o Xk-i), it follows that 

Tfc = E [(afc„i o{u- Xfc_i) + Pyk) (flfc-i o (u - Xfe_i) + PyuY] 

= E [{ak-i o{u- Xk-i)) (afc_i o {u - Xfc_i))^] + E [(ofc-i o (u - Xfc_i)) {Pyk}'^] 

+ E [{Py,) (afc„i o {u - x,^,)f] + E [{Pyk) [Pykf] 

= E [{ttk-i o{u- Xfc_i)) (afc_i o (m - Xfc„i))^] + E [(afe_i o (m - Xfc_i)) (afc_i o x^^i)'^] F^P^ 

+ PfE [{ak-i o Xfe.i) (afc_i o{u- Xk-i))^] + PFE [{ak-i o Xk-i) (a^-i o F^P^. 

(69) 

Then, from Lemma [2l (l69l ) can be rewritten as 



Tfc =(1 - g)2 E [afc„ia^_,] + g(l - g) E [diag(a,._ia^_i)] 

+ q{l -q)E [akWk.i]F^P^ - q{l -q)E [diag(a,„iaLi)]F^P^ 

+ q{l - q)PFE [afc_iaLi] - g(l - g)PPE [diag(afc„iaf_i)] 

+ g^PPE [ak-ial_^]F^P'' + g(l - g)PPE [diag(afc_ia^_i)]P^P^. (70) 

By re-arranging terms in (iTOl) and observing that Vk-i = E \ak-ia^_^ and diag(rfc_i) = E [diag(afc_ia^_5^ 
the result in (|54|) follows. 

Additionally, from Lemma |2l it follows that 



$fe+i = E [yfe+iy^+i] = PE [(afc o Xfc)(a;fc o akf]F 



T 



q^ E [akal] + g(l -q)E [diag(afca^ 



F' 



P[g2r, + g(l-g)diag(r,)]P^. (71) 
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To obtain the iterations for Ek and T^:, the developments are very similar to the ones above. For 
k = 0. 

So = E [aob^] = P E[2/o4]P^ = Pyo^P, (72) 

and 

Ti = E[2/izf] = E [F{ao o xo)(xo o F^] = FE [(ao o xo)(xo o 60)^^^ 
= {qFPyo){qFPzof + g(l - q)FE[dmg{PyoZ^P^)]F'' 
= iqPyo)iqPzof + g(l - q)Fdiag{PyoZ^ P^)F^ 

= y{zl + g(l - g)Fdiag(Pyo2o'^P'')P^, (73) 

where again we used the fact that FP = P (refer to Eq. (|32|) ) . 

For k > 1, from Lemma [2] and (|28l) . and taking into account that y^+i = F(ak o Xk) and 2:^+1 = 
F{bk o Xk), it follows that 

Sfc = E [(ofc-i o (m - Xfe-i) + Pyk) {h-i o (m - Xfc_i) + Pzk)'^] 

= E [(ofc-i o (u - Xfe_i)) o (n - a;fc_i))^] + E [{ak-i o (u - Xfc_i)) {Pzk)^] 

)f]+E[{Pyk){Pzkf] 
= E [(afc_i o{u- Xk-i)) {bk-i o (?i - Xfc_i))^] + E [(0^-1 o {u - Xk~i)) {bk-i o Xfc_i)'^]F'^P'^ 

+ PP E [{ak-i o Xfc_i) o{u- Xk-i))^] +PFE [{uk-i o Xk-i) {bk-i o P^P^ 
= (1 - qy E [afc_i6[„,] + g(l - g) E [diag(afc„i6Li)] 

+ g(l - g) E [a,_i6Li]i^^^^ - - E [diag(afc_i6Li)]i^'^^^ 
+ g(l - g)PPE [ak^ibl^,] - q{l - q)PFE [diag(afc_i6ri)] 

+ g^PPE [ak^,bt^]F^P^ + g(l - g)PPE [diag(afe_iCi)]^'^^'^- (74) 

By re-arranging terms in (l74l) and observing that S^-i = E [ak-ibl_^] and diag(Sfc_i) = E [diag(afc_i6|l^ 
the result in (|58l) follows. Finally, 



P^ 



Tfe+i = E [yk+izl^i] = PE [{akOXk){xk o bkf]F^ 

= P [g2 E [afc^a + g(l - g) E [diag(afc6r)] 
= P [g^Sfc + g(l - g)diag(Hfc)] P^, (75) 

which completes the proof. ■ 
Although omitted in the statement of Lemma [3l it is easy to see that the dynamics of A^. = E[bka1] 
and Qk = ^[zkUk] ^1^° ^e obtained by noting that = '^1 and Qk = T^. 
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V. Convergence Analysis of Robustified Double-Iteration Algorithm 

The previous Section established that the iterations governing the evolution of F^, \E'fc and H^. are 
identical except for the initial conditions. We will show next that the steady- state solutions of these 
iterations are also identical up to a multiplicative constant. To see this, we will rewrite (l54l) . (l56l) . 
and (l58l) in vector form using Kronecker products. For given matrices C, A, and B of appropriate 
dimensions, the matrix equation C = AXB (where X is an unknown matrix) can be rewritten as a set 
of linear equations of the form (B^ A)x = c, where x and c are the vectors that result from stacking 
the columns of matrices X and C respectively, and (g) denotes the Kronecker producj^ of matrices [|26l . 
Let 7fc be the vector that results from stacking the columns of F^ and 7^ the vector that results from 
stacking the columns of diag(Ffc). Then, it can be easily seen that (l54l) can be rewritten as 

7fc =[[qPF + (1 - q)I] ® [qPF + (1 - g)/]]7fc-i + - q)[I - PF] ®[I - PF]\%^i, k>l. 

(76) 

Let G be a diagonal matrix with entries G{{1 — l)n^ + — 1)71^ + /) = 1, V/ = 1, 2, . . . , n^, and zero 
otherwise. Then, the second term on the right hand side of (l76l) can be written as 

g(l - q){[I - PF] ® [/ - PF])7fc_i = g(l - q){[I - PF] ® [/ - PF])G7,„i, k>l, (77) 

which leads us to 

7fc = [[qPF + (1 - q)I] ® [qPF + (1 - q)I] + q{l - q){[I - PF] ® [/ - PF])G]7fc-i, k>l. (78) 

Let 4'k and and 5k be the vectors that result from stacking the columns of "^k, Sfc and respectively. 
Then, it is easy to see that the same recurrence relation as in (l78l) governs the evolution of and ^k- 
Theorem 1: Let P E M"^" be a column stochastic and primitive weight matrix associated with a 
directed graph g = {V, £}, with V = {1, 2, . . . , n} and £ C V x V. Let F = [J„ J„ ... /„] e M"^"', 
where /„ is the n x n identity matrix, and P = [EiP^ EaP^ . . . Er,P^]^ G where each 

Ei G M"^", i G {1,2, .. . ,n}, satisfies Ei{i,i) = 1 and has all other entries equal to zero. Then, for 
any g, < g < 1, the matrix H defined as 

n = [qPF + (1 - q)I] ® [qPF + (1 - q)I] + q{l - q){[I - PF] ® [J - PP])G (79) 

is column stochastic, and it has a single eigenvalue of maximum magnitude at value one. 

Proof: We show first column stochasticity of matrix H. Let C = qPF + (1 — g)/ and D = I — PF, 
so that n = Ci^C + q{l — q){Di^D)G. We will establish that C®C is column stochastic and also show 

■•The Kronecker product of matrices A = [a,j] £ R^x" and B = [bij] G R*"^' is defined (see, e.g., |26l) as the block matrix 

aiiB . . . ai„B 



A®B 



dmlB . . . amnB 
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that the column sums of D®D are all zero. By construction, the entries of the i*^ column of P G M" ^" 
are all zero, with the possible exception of the ones indexed by ((i — \)n + j, z), z, j = 1, 2, . . . , n, 
each of which corresponds to the (j, i) entry of matrix P. Then, it follows that Yl^=i = Sj=i ^ 
l,Vi = 1,2, .. . The matrix PF G M"^^"^ is also column stochastic by construction, as it results 
from horizontally concatenating n times the matrix P, i.e., PF = [P P ... P]; therefore, the matrix C 
is also column stochastic. The kronecker product of C with itself, results in an x ra^ block matrix of 
the form C ® C = [Ci C2 ■ ■ ■ Crfi], where Cj = [cijC^ C2jC^ . . . c„2jC^]^. Then, it follows that the 
sum of the entries of the l^^ column of Cj is X]m=i /) = (XlILi '^u)(Z]r=i ^^z)- Since Yl'i=i'^ij 

and ^^^=1 ^ri are the sum of the entries of the and l^^ columns of C = qPF + (1 — g)/„4 (which is 

4 

column stochastic), it follows that ^m=i^j ("^'0 = 1' therefore, C C is also column stochastic. 

Since PF is column stochastic, the column-sums of D = I — PF are zero. The kronecker product 
of D with itself is of the form D ® D = [Di D2 . . . 0^2], where Dj = [dijD'^ d2jD'^ . . . dn^jD'^Y . 
Using similar arguments as above, it follows that X]m=i = (Yll=i^i3){Yl^=i^ri) = 0, which 
implies that the column-sums of D D are zero. The only thing left to establish that 11 is column 
stochastic is to show that all entries of 11 are nonnegative (from where it immediately follows that 
n = C ® C + q{l — g)(-D ® D)G is column stochastic). We argue nonnegativity of 11 as follows: 
due to the sparsity structure of G in (1791) . the only nonzero entries of {D ® D)G will be in columns 
{k — l)n'^ + k, k = 1,2, . . . , n^; thus except for entries in these columns, the entries of 11 will be identical 
to the corresponding entries in C ® C. From the structure of PF, entries of C ®C and q{l — q){D D) 
can, respectively, take one of the following three forms: 

{qpij + (1 - g)) {qpim + (1 - q)) , (80) 

q{l-q){l-p,j){l-pij, (81) 



or 



qpij{qpim + (1 - q)), (82) 

- g(l - q)pijil - pim), (83) 



or 

,2 



q PijPlm, (84) 

g(l - q)pijPim, (85) 

where Pij > and pim > are the and (/,m) entries of matrix P. For (|80l ) and (|8T| ), the 
corresponding entry of 11 is of the form 

{qPij + (1 - q)) {qpim + (i - q)) + g(i - - - Pim) = qpijpim + (i - q), (86) 
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and satisfies < qpijPim + (1 — Q') < 1 For (|82|) and (l83l) . the corresponding entry of 11 is of the form 

qpij{qpim + (1 - q)) - g(i - q)Piji^ - Pim) = qpijPim, (87) 

and satisfies < qpijPim < 1- For (f84l) and (l85l) . the corresponding entry of 11 is of the form 

q^PijPlm + g(l - q)PijPlra = qPijPlm, (88) 

and satisfies < qpijPim < 1- 

To prove the second assertion, we will show first that matrix PF can be written via a permutation 
of its indices in the form 



U V 
w 



(89) 



where U is an irreducible column stochastic matrix and lim^^oo = 0. Since PF is column stochastic, 
we can assume that it corresponds to the weight matrix of some graph Q = {V,£}. We will show that 
this graph has a single recurrent class plus a few transient states, from which the decomposition of PF 
in dlH) follows. Let 

V = {(1, 1), (2, 1), . . . , (n, 1), (1, 2), (2, 2), . . . , (n, 2), . . . , (n, n - 1), (1, n), (2, n), . . . , (n, n)}. (90) 

From the structure of PF, it follows that for any node G V, one-step transitions out of are 
to nodes of the form {m,i), with i E J\f~, where J\f^ is the set in-neighbors of node m in the graph 
Q (with weight matrix P). From the structure of PF, it also follows that there are possibly several 
rows of PF with all entries equal to zero, which means that a node (i, j) that is associated with such 
row cannot be reached from any other node; however, as already argued, from nodes of the form 
it is possible to reach nodes of the form {m,i), where i E M^. Clearly, the nodes corresponding to 
rows with all entries being zero are transient. Note that the possibility of individual nodes that cannot 
be reached from any other node being disconnected is ruled out as it is easy to see the only nonzero 
diagonal entries of PF correspond to diagonal entries of P, which are strictly smaller than one. 

Next we will show that from a node (z, j) whose corresponding row in PF has some nonzero entries 
one can reach any other node (m, /) whose corresponding row in PF has some nonzero entries. This 
means that all non-transient nodes form a single recurrent class (as already argued all nonzero diagonal 
entries are strictly smaller than one which means there cannot be absorbing nodes). This follows from 
the fact that the graph Q is strongly connected, which means that for any j, / E V, there exists a path 
between j and I. Let ii,i2, . . . ,it denote the nodes traversed along the path between j and /. We will 
show next that for any two non-transient nodes (r, I) eV there exists a path. As already argued, 

from one can reach in a single hop any node of the form (m, i), where m is a neighbor of node i 
in the graph Q. Since ii is the first node traversed in the path between j and /, it follows that (ii, i) eV 
can be reached in one step from By repeatedly using this argument, it follows that the sequence 
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of nodes (zi, i), {12, ii), ■ ■ ■ , [it, it-i), {r, it) forms a path between {i,j) and (r, /), which means that any 
non-transient node can be reached by any other non-transient node; thus, the set of non-transient nodes 
forms a single recurrent class. Clearly, the vertex set V can be decomposed into a single recurrent class 
and possibly several transient nodes. By re-ordering the nodes, it follows that PF can be rewritten as 
in (|891) (see, e.g., [|27l p. 126]). Furthermore, since Q in (|89| ) is irreducible, it follows that qQ + (1 — q)I 
(where / is the identity matrix) is primitive. It follows that C = qPF + (1 — g)/ has a unique largest 
eigenvalue of value one, i.e., Ai = 1, and 1 > IA2I > ■ ■ ■ > |A„2|. Let (t(C) = {Ai, A2, . . . , A„2}. Then, 



a{C ® C) = {AjAj, z = 1, . . . , 77,, j = 1, . . . , 77,}, including algebraic multiplicities in both cases 11261 
p. 245]. Since Ai = 1 is unique (multiplicity one) and |Aj| < 1, z = 2,...,?t,^, it follows that the 
eigenvalue of C ® C = [qPF + (1 — q)I]® [qPF + (1 — q)I] of largest magnitude also takes value 1 
and is unique. Since C (8> C is column stochastic and Ai = 1 is unique, we know that either C ® C is 
also primitive or it can be decomposed following a permutation of indices to the form [|27l p. 126]: 



L M 
N 



(91) 



where L is a primitive matrix and limfc^oo N'^ = 0. 

We will show next that U = C ® C + q{l — q){D ® D)G has exactly the same nonzero entries as 
C ®C and therefore can be decomposed following the same permutation of indices to the form in (|9T| ). 
As argued before, due to the sparsity structure of G in (|79l ). the only nonzero entries of {D ® D)G will 
be in columns {k — l)n^ + fc, = 1, 2, . . . , n^, thus except for entries in the aforementioned columns, 
the nonzero entries of 11 will be the same as those m G ® G . For all other columns in 11 (that include 
nonzero entries in {D ® D)G), it was shown in (|86l) -(l88l) that the nonzero entries of 11 are strictly 
positive, from where it follows that 11 has the same sparsity structure as C (S) C, which means that 11 
can also be decomposed in the form of (|9TI) (for some matrices L', M', N'), and the resulting upper-right 
block is also a primitive matrix. Therefore, 11 has a unique largest eigenvalue at one. ■ 

The following two lemmas establish that the first and second moments of and hk, and and 
converge to the same solution up to a scalar multiplication. These two lemmas will be used to show 

that as A; — 7- 00, the random vector Vk = — ai/k, for a = will converge almost surely to 

Lj=iZ/o(jj 

t> = 0. This suggests that, as A; —t- 00, and whenever Zk is nonzero, each node i can obtain an estimate 
of a = by calculating the ratio yuii) / Zk{i)- We will also show that, in fact, z^ will be larger 

than some threshold infinitely often. 



Lemma 4: The first moments of and (also yk and Zk asymptotically converge to the same 
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solution up to scalar multiplication: 

lim Zk = a lim y^, (92) 
lim bk = a lim ak, (93) 

k—^OD k—^oo 

EJ=i2o(j) 

where a = ^ --. 

Proof: In Lemma [H it was shown that y^+i = 

[qP+{l-q)l]yf, mdzk+i = [qP+{l-q)l]zk with 
Vi = QUo^ and zi = qzQ. Since P is column stochastic and primitive, it follows that [qP+{l — q)I~\ is also 
column stochastic and primitive. Thus, limfc^oo^fc = culiuik^ooyk^ where from the column stochasticity 
property it follows that ELi^fc(j) = g(ELi^o(j)) and ELi^fc(j) = ^lELi^oO')). VA; > 1; this 

y^n zo(j) 

implies that a = ^i^^ which establishes (l92l) . 

By noting that Tiq = Py^, and 6o = Pyo^ and using the fact that P is column stochastic, it follows 
that E"=i«fc(j) = E"=i«o(j) = E"=iyo(j) and E"=Jfc(j) = EJ=Jo(j) = E"=i2o(j)- Since 
qPF + (1 — q)I (i.e., the matrix that governs the dynamics of and bk) is column stochastic and, as 
shown in the proof of Theorem [H has a single largest eigenvalue at value 1 , a similar development to 
the one above can be used to show ( |93] ). ■ 

Lemma 5: Define Wk = b^ — aak and denote by Xk the vector that results from stacking the columns 
of Xfc := Elwhwl]. Then, it follows that 

Xk = nxfc-i, (94) 

4 

with xq = ^o + «^7o - a(^o + "^o) and ^"^^ xo(0 = 0. 

Proof: Since X,, := E[wkwl] = E[bkbl] + a^Ela^al] - a(E[akbl] + E[6fca^]) = ^k + a^^k - 
+ Afc), it follows that Xk = ipk + cn'^lk — + ^fc)- From (1781 ) and subsequent discussion, it 
follows that 7fe = n7fc„i, ipi, = Utp^.i, = n^fc-i, and 4 = n4„i, thus Xk = ^ipk-i + a'^'^lk-i - 
aijl^k-i + n(5fc_i) = n(^fc_i + a^jk-i - + 5fc_i)) = nxfc-i- 
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In Lemma [3l it was shown that Fq = -Pyoyjf -P^? = PzqZqP'^, and Sq = Py^z^P^ = A.^. Since 
7o, V^oj 'Co» and (5o result from stacking the columns of Tq, "^q, Eq, and Aq, it follows that 

E^oW = EEro(^,j) = I E^o(^) 1 ' (95) 



1=1 i=l j=l 



E^o(^) = EE^o(^'^')= (E"o«) , (96) 

1=1 i=l j=l 



„4 



E^o(o = EE-o(^, 

1=1 i=i j=i 



J) 




E'5o(o = EE^o(^'^')= lE^oW) |E2/o(^))' (97) 

where the last equality is obtained by taking into account that i) matrix P is column stochastic by 
construction, and ii) for any a,b E M", we have that XlILi Sj=i '^b'^ihj) = iYl?=i '^^(S/Li bi)- Since 

« = ilrSuy' it follows that Er=i Xo(0 = Er=i(^o(0 + a'loil) - a^l) + 5o(0)) = 0- ■ 
Theorem 2: Let and be the random vectors that result from iterations (I28l)-(l29l) and (l30l)-(l3T1). 
Define Vk = Zk — ayk, where a = Then, ||ffe||oo — ^ almost surely. Furthermore, for every 

j\ ^fc(j) — )■ as /c — > oo almost surely (i.e., for every j, limk^ooVk{j) = with probability one). 

Proof: The result follows from the first Borel-Cantelli lemma [|28l Theorem 7.3.10]. For all A; > 
and all e > 0, define the event Ek{e) = {||fA,||oo > e}- We will first establish an upper bound on 
EZoP^{Ek{e)} by noting that Pr{i?fc(e)} = Pr{||i;fc||oo > e} < thus ET=oP^{Ek{e)} < 

iEr=oE[||t;.||oo] < iEr=oE[||^.||2]. Note that E[||^,|h] = (EK^.;,]))^^ = (trace(E[^,^f ])) = 
(trace(E[2fc2;J]) + ahrace{E[ykyl]) - 2atrace(E[?/fc2;^])^/2. We will next show that E[||ufe||2] as 
A; — )■ oo geometrically fast. Using Lemma|3l it can be established that E[ffct;J] = 'E[zkzl]+a'^ E[yfcy^] — 
a{E[ykzl] + E[zkyl])) = ^['z'^fc-i + 'Zll " g)diag(Xfc_i)]F^ where Xk^i = E[wkwl] as defined in 
Lemma[5l thus the evolution of E[ffef is governed by the evolution of Xk-i or by Xk~i (the vector that 
results from stacking the columns of Xk^i). In Theorem [H we showed that FI has a unique eigenvector 
(with all entries strictly positive) associated to the largest eigenvalue Ai = 1. Then, the solution of (|941 ) is 
unique and equal to this eigenvector (up to scalar multiplication). Since H is a column stochastic matrix, 

4 4 

and Lemma |5] established that X]r=iXo(0 = 0' it follows that YM=iXkil) = 0,k > 0, and therefore 
limfe^ooXA:(0 = O5 Additionally, it is well-known that the convergence of (|94|) is geometric with a 
rate of convergence given by the eigenvalue A2 of FI with the second largest modulus, which satisfies 
IA2I < Ai = 1 (see, e.g., [l27|)- Thus, we have established that Xfc(0 ~^ 0? geometrically fast, 
from where it follows that all the entries of E[wfcwJ] go to zero also geometrically fast. Therefore, the 
trace{E[vkv'^]) also goes to zero geometrically fast, so that E [11^^112] also goes to geometrically fast. 
It immediately follows that Xlfclo ^ [|kfc||2] < 00 and therefore Xlfclo P^ill^^lloo > e} < C)0. Then, 
from the first Borel-Cantelli lemma Pr{||t>fe||oo > e infinitely often} = (or Pr{||ffc||oo > e i-O.} = 0). 
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Finally, since, for every j, \\vk\\oo > \vkij)\, then, for every j, Pr{||i;fc||oo > e} > Pr{\vk{j)\ > e}, and 
thus, for every j, Xlfclo > e} < c>o. Then, by Theorem 7.2.4.C of |[28ll . for every j, Vk{j) — > 

almost surely. ■ 

Theorem [2] has established that, in the limit as the number of iterations k becomes large, the values 
of vectors and will be perfectly aligned so that Zk — ayk = with probability one. Thus, in this 
limiting case, each node j can calculate the value of ^ by taking the ratio as long as Zk{j) ^ 0. 
Note that, as also evidenced by the simulations provided for the small network of Fig. \T\ (e.g^ the plots 
on the left and in the middle for Figure [3]), the vectors yk and z^ do not converge in any wayjfl however, 
the values yk and Zk become perfectly aligned (with probability one), allowing each node j to calculate 
a ~ zfelil' ^^^^ problem here arises when yk{j) and Zk{j) have both value zero, which does not 
constitute a violation of z^ — ayk = 0, but clearly does not allow node j to calculate the desired value 
^. This is evidenced also in the simulations provided for the small network of Fig. [T] for example, in 
the plots in Fig. |4l the values of ykU) and Zk{j) often go to zero (simultaneously) leaving their ratio 
undefined^ The next two theorems essentially establish that Zk{j), j = 1, 2, . . . , n, will be greater than 
zero (in fact, greater than a constant C that will be specified) infinitely often. Note that, in subsequent 
developments, Zk{j) is denoted with Zj[k] in order to remain close to the notation in (fT5l)-(fT7l). 

Theorem 3: Consider a (possibly directed) strongly connected graph Q = (V, £) and the iteration 
in (fT5l)-(fT7l). where Xji[k], E £, k = 0,1,2,..., are independent identically distributed (i.i.d.) 

indicator R.V.'s as defined in ([T]), i.e., Xji[k] = 1 with probability q and Xji[k] = with probability 
1 — g, independently between E £ and independently for different k. For every j = 1,2, ... ,n, 
define the event El = {zj[kn\ > C}, k > 1, where C = , I^L. = maxjev{l?/}, 

n = |V|, and m = \£\. Let (I denote the indicator of the event El, k > 1, i.e., (^l = 1 whenever 
El, k > 1 occurs, and Cfc = otherwise. Then, whatever Ci, C2, • • • , Cfc-i, we have that 

FT{z,[{k + l)n] > C I CI CLi, . . . , C^} > g^ Vj. (98) 

Proof: Note that the iteration in (fT5l) to (flTI) involves nonnegative quantities: since for every j, 
Zj[0] > 0, Vj, it follows from (l30l)-(l3T1) that, for every j, Zj[k] > 0, A; > 0. Then, it is not hard to 
establish that the total mas^ -Mk+i in the system, defined as 

n 

^Earlier, we established that, for large k, the quantities E[yk], E[zk], E[ykyk], E[zkz'^] and Ely^zJ^] converge, but this does not imply 
any convergence for the values of or z^.. 

*Since in the simulations for the plots in Fig. [4] each packet (including self-packets) can be dropped with probability 1 — g at iteration 
k, there is a nonzero probability that all packets destined for node j will be dropped, causing both of its values at the next iteration 
(yk+iij) and Zk+i{j)) to be zero. For instance, in the simulation of Fig.|4] 2^(1) will be zero with probability at least (1 — q)^ — 0.81 
because node 1 will have value zero if both packets destined for it (including the self-packet) are dropped. 

'This notion is discussed in great detail in Part II of this paper. 
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satisfies 

Mk+i = n , for all A; = 0, 1, 2, ... . 
[This follows from the fact that Mq = J2^=i = ^ ^^'^ the observation that 

n 

Mk+i := + J] (a,,[A;]-r,aA;-l])(l-a;,,[/c]) 

= {aji[k] - Tji[k - l])xji[k] + {aji[k] - Tji[k - 1])(1 - Xji[k]) 

= E ('^Ak]-r,,[k-l]) 

= J2 ( ^ji - 1] + [k] - (^ji [k - A^ji - 1] - \k-2\{\- Xji [k-1]) 

n 

= 5Z (^.■a^-l]-^,.[^-2])(l-a;,aA;-l]), 

which is equal to M.k-] 

The definition of M.k+i in (l99l) involves the summation of n + m nonnegative quantities, namely, 
+ 1] for j = 1, 2, n and mji[k + 1] := {(Tjiyk] —Tji[k — 1])(1 —Xji[k]) for (j, i) e £. We can think 
of these quantities as follows: Zj[k + 1] is the mass at node j, whereas 'mji\k + 1] is the mass waiting to 
get transferred to node j from node i. Since all of these quantities are nonnegative, at least one of them 
is larger or equal to Regardless of whether this quantity is associated with a node (say node j*) 
or a link (say link (j*, i*)), this mass has at least one way of reaching any node i of interest in graph 
Q via a path of length at most n — 1 (because the graph Q is strongly connected): in particular, there 
is at least one path of length at most n — 1 from node j* to node i and all the links in this path have 
weight at least If all these links are activated, which occurs with probability (g" in the case 
of link because the mass needs to first transfer to j*), then a fraction (— ^)"~^ of the mass 

will transfer to node i in at most n steps. Then, since for every j, Zj[k] > 0, k >0, independently of 
the values of Zj[ln\, I = 1,2, ... ,k, FT{zj[{k + l)n] > C \ d, (l-v • • • ' Ci} > obtains, whatever 
Ci, C2, . . . , Finally, for every j , FT{z,[{k + l)n] > C \ Cl CLi, • • • , Cf} = 1 - Pr{^,[(A; + l)n] < 
C I ClCi-i.---.Ci} < l-Pr{z^[{k + l)n] = Q\ Ci,CLi,...,Ci} < l-g''^^ where I?7 is the in-degree 
of node j. ■ 

Given a sequence of events Ei, E2, ■ . . , E^, . . . defined on some probability space, the next theorem 
(which we do not prove) states the 1912 Borel criterion for establishing whether the event that infinitely 
many of the E^ occur, denoted by {E^ i.o}, will occur with probability one or zero (see, e.g., [29] , 
[|30l ). This result, together with the result in Theorem [3] will be used to establish that, for every j, the 
event E-^ = {zj[kn] > C}, k > 1 occurs infinitely often. 

Theorem 4: Let {Ek}, A; = 1, 2, . . . , be a sequence of events defined on some probability space. Let 
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(k be the indicator function of the event Ek. Let Fr{Ek+i \ (k, Ck-i, • • • , Ci} denote the conditional prob- 
ability of the event E^j^i given the outcome of previous trials. If < p'f, < Pi{Ef,+i \ Ck, Cfc-i, • • • , Ci} ^ 
p'l for every k, whatever d, C2, • • • , Ck, then i) PrjEfc i.o.} = if J^kLiP'k — ^^'^ i-O-} = 1 

if Er=iK = oo- 

Theorem 5: Consider a (possibly directed) strongly connected graph Q = (V, £) and the iteration 
in (fT5])-(fT7]). For every j = 1,2, ...,n, define the event El = {zj[kn] > C}, k > 1, where C = 
^iax = niaxjevi'D^}, n = \V\, and m = \£\. Then, Vi{Ek i.o.} = 1. 
Proof: Theorem [3] established that, for every j, Vi{zj[{k + l)n] > C \ CL Cl-i^ ■ ■ ■ ^ Ci} ^ 
Define = g", then it follows that ^^^Pfc = 00, and by the second assertion in Theorem H we 
conclude that, for every j, Pt{EI i.o.} = 1. ■ 

The final piece is to establish that whenever zj [k] > C, which occurs infinitely often, each node will 
be able to calculate an estimate of I; by calculating the ratio yj[k]/zj[k] and this estimate will converge 
to 1 /a as A; goes to infinity. 

Theorem 6: For each j, let k = ti, t2, ■ ■ ■ be an increase sequence of time steps for which Zj[k] > C . 
Then, almost surely 

a 



lim 

n— >oo 



Zj \tn\ 



0. 



(100) 



Proof: Since Zj[k] > C for k = ti,t2, 
of Theorem [3l we established that M.k = 



, it follows that 



i/j[<n] 



1 



> 



ayj[tn\-Zj[tn] 



< Also, in the proof 

,k>0, from where it follows that zj [tn] < n, therefore 



^ a — • III Theorem [21 we established that — Zj[k] \ — almost surely, which 

implies that the subsequence — Zj[tn]\ — )■ almost surely, then since C < n, we have that 



lim 

n— >oo 



Vjitn] 1 


< lim 


Zj[tn] a 





a 



Vj [^n] Zj [t^ 



aC 



(101) 



almost surely. 



VI. Concluding Remarks 

In this paper, we proposed a method to ensure robustness of a class of linear-iterative distributed 
algorithms against unreliable communication links that may drop packets. We used statistical-moment 
analysis and the Borel-Cantelli lemmas to establish the correctness of the proposed robustified algorithm. 
In Part II of this paper, we establish similar convergence properties by recasting the problem as a finite 
inhomogeneous Markov and using coefficients of ergodicity commonly to used in analyzing this type 
of Markov chains. 
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